
    Semantic Similarity in a Taxonomy: An Information-Based Measure and its Application to Problems of Ambiguity in Natural Language

    This article presents a measure of semantic similarity in an IS-A taxonomy based on the notion of shared information content. Experimental evaluation against a benchmark set of human similarity judgments demonstrates that the measure performs better than the traditional edge-counting approach. The article presents algorithms that take advantage of taxonomic similarity in resolving syntactic and semantic ambiguity, along with experimental results demonstrating their effectiveness.
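
A minimal sketch of an information-content similarity over an IS-A taxonomy may help make the idea concrete. It assumes the common formulation in which the similarity of two concepts is the largest value of -log p(c) over all concepts c that subsume both. The toy taxonomy, the counts, and every function name below are hypothetical and are not taken from the article.

```python
import math

# Hypothetical toy IS-A taxonomy: child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "animal": "entity",
    "dog": "animal",
    "cat": "animal",
    "vehicle": "entity",
    "car": "vehicle",
}

# Hypothetical corpus counts of each concept's own occurrences (total 40).
COUNTS = {"entity": 0, "animal": 2, "dog": 10, "cat": 8, "vehicle": 1, "car": 19}


def ancestors(concept):
    """Return the concept itself plus every concept that subsumes it."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def concept_probability(concept):
    """p(c): share of all observations that fall under c or its descendants."""
    total = sum(COUNTS.values())
    covered = sum(n for c, n in COUNTS.items() if concept in ancestors(c))
    return covered / total


def information_content(concept):
    """IC(c) = -log p(c); rarer (more specific) concepts carry more information."""
    return -math.log(concept_probability(concept))


def shared_information_similarity(a, b):
    """Similarity = highest information content among the shared subsumers."""
    shared = set(ancestors(a)) & set(ancestors(b))
    return max(information_content(c) for c in shared)
```

In this toy setup, "dog" and "cat" share the subsumer "animal", so they score higher than "dog" and "car", whose only shared subsumer is the root with probability 1 and information content 0. An edge-counting measure would instead depend only on path lengths in the taxonomy.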

    PowerAqua: fishing the semantic web

    The Semantic Web (SW) offers an opportunity to develop novel, sophisticated forms of question answering (QA). Specifically, the availability of distributed semantic markup on a large scale opens the way to QA systems which can make use of such semantic information to provide precise, formally derived answers to questions. At the same time, the distributed, heterogeneous, large-scale nature of the semantic information introduces significant challenges. In this paper we describe the design of a QA system, PowerAqua, built to exploit semantic markup on the web to provide answers to questions posed in natural language. PowerAqua does not assume that the user has any prior information about the semantic resources. The system takes as input a natural language query and translates it into a set of logical queries, which are then answered by consulting and aggregating information derived from multiple heterogeneous semantic sources.

    Metal silicide/poly-Si Schottky diodes for uncooled microbolometers

    Nickel silicide Schottky diodes formed on polycrystalline Si films are proposed as temperature sensors of monolithic uncooled microbolometer IR focal plane arrays. Structure and composition of nickel silicide/polycrystalline silicon films synthesized in a low-temperature process are examined by means of transmission electron microscopy. The Ni silicide is identified as a multi-phase compound composed of 20 to 40% of Ni3Si, 30 to 60% of Ni2Si and 10 to 30% of NiSi, with probable minor content of NiSi2 at the silicide/poly-Si interface. Rectification ratios of the Schottky diodes vary from ~100 to ~20 as the temperature increases from 22 to 70 °C; they exceed 1000 at 80 K. A barrier of ~0.95 eV is found to control the photovoltage spectra at room temperature. A set of barriers is observed in photo-emf spectra at 80 K and attributed to the Ni-silicide/poly-Si interface. Absolute values of temperature coefficients of voltage and current are found to vary from 0.3 to 0.6%/K for forward biasing and around 2.5%/K for reverse biasing of the diodes. Comment: 18 pages, 7 figures

    Island Fox Spatial Ecology and Implications for Management of Disease

    Disease, predation, and genetic isolation resulted in 4 of 6 island fox (Urocyon littoralis) subspecies being listed as endangered in 2004. Potential for disease outbreaks continues to pose a major threat to the persistence of these isolated, endemic populations. We examined how roads influence the spatial ecology of San Clemente Island foxes (U. l. clementae), particularly in regard to spread of disease, to provide management recommendations for preventing or minimizing a disease outbreak on San Clemente Island, California, USA. Home range areas (x̄ = 0.75 km²) and core areas (x̄ = 0.19 km²) of foxes on San Clemente Island were 0.36–1.23 and 2.17 times larger, respectively, than estimates from Santa Cruz Island foxes (U. l. santacruzae). Home ranges and core areas were 78% larger and 73% larger, respectively, for foxes near roads than for foxes away from roads. Home ranges were also largest when foxes were not caring for offspring (i.e., seasons of pup-independence and breeding). We did not detect any dispersal movements, but foxes living near roads moved 33% farther in 2-hour periods than foxes not living near roads. Foxes near roads move faster, range more widely, and could more rapidly spread a pathogen throughout the island; therefore, roads might serve as transmission corridors. We recommend reducing this risk by increasing widths of vaccination firewalls (areas where vaccination is used to induce a disease-resistant or immune population of foxes), ensuring these areas deliberately intersect roads, and vaccinating a higher proportion of foxes living near roads. Disease risk models incorporating these strategies could inform the lowest risk scenarios.

    GOSim – an R-package for computation of information theoretic GO similarities between terms and gene products

    Background: With the increased availability of high throughput data, such as DNA microarray data, researchers are capable of producing large amounts of biological data. During the analysis of such data there is often the need to further explore the similarity of genes not only with respect to their expression, but also with respect to their functional annotation, which can be obtained from Gene Ontology (GO). Results: We present the freely available software package GOSim, which calculates the functional similarity of genes based on various information theoretic similarity concepts for GO terms. GOSim extends existing tools by providing additional, recently developed functional similarity measures for genes. These can, for example, be used to cluster genes according to their biological function. Conversely, they can also be used to evaluate the homogeneity of a given grouping of genes with respect to their GO annotation. GOSim hence provides the researcher with a flexible and powerful tool to combine knowledge stored in GO with experimental data. It can be seen as complementary to other tools that, for instance, search for significantly overrepresented GO terms within a given group of genes. Conclusion: GOSim is implemented as a package for the statistical computing environment R and is distributed under GPL within the CRAN project.

    Zero-shot language transfer for cross-lingual sentence retrieval using bidirectional attention model

    We present a neural architecture for cross-lingual mate sentence retrieval which encodes sentences in a joint multilingual space and learns to distinguish true translation pairs from semantically related sentences across languages. The proposed model combines a recurrent sequence encoder with a bidirectional attention layer and an intra-sentence attention mechanism. This way the final fixed-size sentence representations in each training sentence pair depend on the selection of contextualized token representations from the other sentence. The representations of both sentences are then combined using the bilinear product function to predict the relevance score. We show that, coupled with a shared multilingual word embedding space, the proposed model strongly outperforms unsupervised cross-lingual ranking functions, and that further boosts can be achieved by combining the two approaches. Most importantly, we demonstrate the model's effectiveness in zero-shot language transfer settings: our multilingual framework boosts cross-lingual sentence retrieval performance for unseen language pairs without any training examples. This enables robust cross-lingual sentence retrieval also for pairs of resource-lean languages, without any parallel data.
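
The bilinear product mentioned in the abstract can be illustrated with a small sketch. It assumes the plain form score = uᵀ W v for two fixed-size sentence vectors u and v and a weight matrix W. The function name, the list-based representation, and the absence of any bias term or output nonlinearity are assumptions made for illustration; the abstract does not specify those details.

```python
def bilinear_score(u, weights, v):
    """Return u^T W v for vectors u, v and a len(u) x len(v) matrix W.

    All arguments are plain Python lists of numbers (hypothetical layout).
    """
    if len(weights) != len(u) or any(len(row) != len(v) for row in weights):
        raise ValueError("weights must have shape len(u) x len(v)")
    # For each row i: u[i] * (row i of W dotted with v), summed over i.
    return sum(
        u_i * sum(w_ij * v_j for w_ij, v_j in zip(row, v))
        for u_i, row in zip(u, weights)
    )


# With the identity matrix the score reduces to the ordinary dot product.
source_vec = [1.0, 2.0]
target_vec = [3.0, 4.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
score = bilinear_score(source_vec, identity, target_vec)
```

In a trained model the matrix would be learned, which lets the score weight interactions between different dimensions of the two sentence representations rather than only matching dimensions.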

    Propagating user interests in ontology-based user model

    In this paper we address the problem of propagating user interests in ontology-based user models. Our ontology-based user model (OBUM) is devised as an overlay over the domain ontology. Using ontologies as the basis of the user profile allows the initial user behavior to be matched with existing concepts in the domain ontology. Such an ontological approach to user profiling has proven successful in addressing the cold-start problem in recommender systems, since it allows for propagation from a small number of initial concepts to other related domain concepts by exploiting the ontological structure of the domain. The main contribution of the paper is a novel algorithm for propagation of user interests which takes into account i) the ontological structure of the domain and, in particular, the level at which each domain item is found in the ontology; ii) the type of feedback provided by the user; and iii) the amount of past feedback provided for a certain domain object.

    Statistical methods in language processing

    The term statistical methods here refers to a methodology that has been dominant in computational linguistics since about 1990. It is characterized by the use of stochastic models, substantial data sets, machine learning, and rigorous experimental evaluation. The shift to statistical methods in computational linguistics parallels a movement in artificial intelligence more broadly. Statistical methods have so thoroughly permeated computational linguistics that almost all work in the field draws on them in some way. There has, however, been little penetration of the methods into general linguistics. The methods themselves are largely borrowed from machine learning and information theory. We limit attention to that which has direct applicability to language processing, though the methods are quite general and have many nonlinguistic applications. Not every use of statistics in language processing falls under statistical methods as we use the term. Standard hypothesis testing and experimental design, for example, are not covered in this article. WIREs Cogni Sci 2011 2 315–322. DOI: 10.1002/wcs.111. For further resources related to this article, please visit the WIREs website. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/83468/1/111_ftp.pd

    Identification of disease-causing genes using microarray data mining and gene ontology

    Background: One of the best and most accurate methods for identifying disease-causing genes is monitoring gene expression values in different samples using microarray technology. One of the shortcomings of microarray data is that they provide a small quantity of samples with respect to the number of genes. This problem reduces the classification accuracy of the methods, so gene selection is essential to improve the predictive accuracy and to identify potential marker genes for a disease. Among numerous existing methods for gene selection, support vector machine-based recursive feature elimination (SVMRFE) has become one of the leading methods, but its performance can be reduced because of the small sample size, noisy data and the fact that the method does not remove redundant genes. Methods: We propose a novel framework for gene selection which uses the advantageous features of conventional methods and addresses their weaknesses. In fact, we have combined the Fisher method and SVMRFE to utilize the advantages of a filtering method as well as an embedded method. Furthermore, we have added a redundancy reduction stage to address the weakness of the Fisher method and SVMRFE. In addition to gene expression values, the proposed method uses Gene Ontology which is a reliable source of information on genes. The use of Gene Ontology can compensate, in part, for the limitations of microarrays, such as having a small number of samples and erroneous measurement results. Results: The proposed method has been applied to colon, Diffuse Large B-Cell Lymphoma (DLBCL) and prostate cancer datasets. The empirical results show that our method has improved classification performance in terms of accuracy, sensitivity and specificity. In addition, the study of the molecular function of selected genes strengthened the hypothesis that these genes are involved in the process of cancer growth. 
    Conclusions: The proposed method addresses the weaknesses of conventional methods by adding a redundancy reduction stage and utilizing Gene Ontology information. It predicts marker genes for colon, DLBCL and prostate cancer with high accuracy. The predictions made in this study can serve as a list of candidates for subsequent wet-lab verification and might help in the search for a cure for cancers.
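
As a rough illustration of the filtering step that this abstract calls the Fisher method, the sketch below ranks genes by a two-class Fisher score, (mean_a - mean_b)² / (var_a + var_b). This particular formula is one common form of the criterion and is assumed here, since the abstract does not give the exact definition. The gene names, expression values, and function names are hypothetical, and the SVM-RFE, redundancy reduction, and Gene Ontology stages are not shown.

```python
from statistics import mean, pvariance


def fisher_score(class_a, class_b):
    """Two-class Fisher score: squared mean gap over summed class variances."""
    gap = (mean(class_a) - mean(class_b)) ** 2
    spread = pvariance(class_a) + pvariance(class_b)
    if spread == 0:
        # Degenerate case: no within-class variation at all.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def rank_genes(expression, labels):
    """Order genes from most to least discriminative.

    expression: dict mapping gene name -> list of values, one per sample.
    labels: list of 0/1 class labels aligned with the samples.
    """
    scores = {}
    for gene, values in expression.items():
        class_a = [x for x, y in zip(values, labels) if y == 0]
        class_b = [x for x, y in zip(values, labels) if y == 1]
        scores[gene] = fisher_score(class_a, class_b)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: four samples, two per class.
toy_expression = {"gene_1": [1, 2, 9, 10], "gene_2": [5, 6, 5, 6]}
toy_labels = [0, 0, 1, 1]
ranking = rank_genes(toy_expression, toy_labels)
```

A filter of this kind scores each gene independently, which is why it cannot detect redundant genes on its own and why a separate redundancy reduction stage is a natural complement.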